Automatic Analysis Of Descriptive Texts

Author

  • James R. Cowie
Abstract

This paper describes a system that attempts to interpret descriptive texts without the use of complex grammars. The purpose of the system is to transform the descriptions to a standard form which may be used as the basis of a database system knowledgeable in the subject matter of the text. The texts currently used are wild plant descriptions taken directly from a popular book on the subject. Properties such as size, shape and colour are abstracted from the descriptions and related to parts of the plant in which we are interested. The resulting output is a standardised hierarchical structure holding only significant features of the description. The system, implemented in the PROLOG programming language, uses keywords to identify the way segments of the text relate to the object described. Information on words is held in a keyword list of nouns relating to parts of the object described. A dictionary contains the attributes of ordinary words used by the system to analyse the text. The text is divided into segments using information provided by conjunctions and punctuation. About half the texts processed are correctly analysed at present. Proposals are made for future work to improve this figure. There seems to be no inherent reason why the technique cannot be generalised so that any text of semi-standard descriptions can be automatically converted to a canonical form.
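The approach outlined in the abstract can be illustrated with a minimal sketch. It is written in Python rather than PROLOG, and it is not the author's implementation: the keyword list, the dictionary entries, the conjunction set, and the function names `split_segments` and `analyse` are all hypothetical, chosen only to show how segmentation by punctuation and conjunctions, a keyword list of part nouns, and a dictionary of word attributes might combine into a hierarchical result.

```python
import re

# Hypothetical keyword list: nouns that name parts of the plant,
# mapped to a canonical part name.
PART_KEYWORDS = {
    "leaf": "leaf", "leaves": "leaf",
    "flower": "flower", "flowers": "flower",
    "stem": "stem", "stems": "stem",
}

# Hypothetical dictionary: ordinary words mapped to the property they express.
DICTIONARY = {
    "yellow": "colour", "white": "colour", "green": "colour",
    "oval": "shape", "round": "shape", "narrow": "shape",
    "small": "size", "large": "size", "tall": "size",
}

# Assumed set of conjunctions that also act as segment boundaries.
CONJUNCTIONS = {"and", "but", "or"}


def split_segments(text):
    """Divide text into word lists, breaking at punctuation and conjunctions."""
    segments = []
    for chunk in re.split(r"[.,;:]", text.lower()):
        current = []
        for word in chunk.split():
            if word in CONJUNCTIONS:
                if current:
                    segments.append(current)
                current = []
            else:
                current.append(word)
        if current:
            segments.append(current)
    return segments


def analyse(text):
    """Build a part -> property -> values structure from a description."""
    result = {}
    part = None  # the plant part that the current segment refers to
    for segment in split_segments(text):
        # A part keyword switches the part that properties attach to;
        # a segment without one inherits the part from the previous segment.
        for word in segment:
            if word in PART_KEYWORDS:
                part = PART_KEYWORDS[word]
        if part is None:
            continue  # no part identified yet, so nothing to attach to
        for word in segment:
            prop = DICTIONARY.get(word)
            if prop is not None:
                result.setdefault(part, {}).setdefault(prop, []).append(word)
    return result


if __name__ == "__main__":
    print(analyse("Leaves oval and green; flowers small, yellow."))
```

In this sketch a segment with no part noun is attached to the most recently mentioned part, which is one simple way to relate segments to the object described; the abstract does not say how the original system resolves such cases.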


Similar resources

Uma Ferramenta para Identificar Desvios de Linguagem na Língua Portuguesa (A tool to identify the linguistic deviations in the Portuguese Language)[In Portuguese]

Abstract. The revision of formal texts is a complex task and occurs in several areas. The objective of this work is to create a tool to support the revision of texts and promote studies in automatic correction of descriptive texts. We propose a reviewer for automatic identification of language deviations in formal descriptive texts using natural language processing techniques. A case study...


Natural Language Processing And The Automatic Acquisition Of Knowledge: A Simulative Approach

The paper presents the general design and the first results of a research project whose long term goal is to develop and implement ALICE, an experimental system capable of augmenting its knowledge base by processing natural language texts. ALICE (an acronym for Automatic Learning and Inference Computerized Engine) is an attempt to model the cognitive processes that occur in humans when the...


Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...


Lessons from building a Persian written corpus: Peykare

This paper addresses some of the issues learned during the course of building a written language resource (called ‘Peykare’) for contemporary Persian. After defining five linguistic varieties and 24 different registers based on these linguistic varieties, we collected the texts for Peykare to do a linguistic analysis, including cross-register differences. For tokenization of Persian, we have pr...


Learning to Read Bushman: Automatic Handwriting Recognition for Bushman Texts

The Bleek and Lloyd Collection contains notebooks that document the tradition, language and culture of the Bushman people who lived in South Africa in the late 19th century. Transcriptions of these notebooks would allow for the provision of services such as text-based search and text-to-speech. However, these notebooks are currently only available in the form of digital scans and the manual crea...


The Mediating Role of Automatic Thoughts in Relationship Between Attachment Style with Sexual Dysfunction and Marital Commitment: A Path Analysis

Background: This article explores the effects of attachment style and automatic thoughts on sexual dysfunction and marital commitment, using the path analysis model. This descriptive-correlational study was conducted on 375 married female students in Shahid Chamran University of Ahvaz, Iran, from 2016 to 2017. Methods: According to Morgan and Jersey table and the statistical population (375 pe...




Publication date: 1983